AITopics | new framework

Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning

Neural Information Processing Systems | Mar-17-2026, 12:31:36 GMT

Statistical performance bounds for reinforcement learning (RL) algorithms can be critical for high-stakes applications like healthcare. This paper introduces a new framework for theoretically measuring the performance of such algorithms called Uniform-PAC, which is a strengthening of the classical Probably Approximately Correct (PAC) framework. In contrast to the PAC framework, the uniform version may be used to derive high-probability regret guarantees and so forms a bridge between the two setups that has been missing in the literature. We demonstrate the benefits of the new framework for finite-state episodic MDPs with a new algorithm that is Uniform-PAC and simultaneously achieves optimal regret and PAC guarantees except for a factor of the horizon.

artificial intelligence, proceedings, reinforcement learning, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

1ee942c6b182d0f041a2312947385b23-AuthorFeedback.pdf

Neural Information Processing Systems | Feb-7-2026, 17:46:32 GMT

privacy model, public data, sample complexity, (14 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.30)

Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism

Neural Information Processing Systems | Dec-24-2025, 04:58:34 GMT

Offline (or batch) reinforcement learning (RL) algorithms seek to learn an optimal policy from a fixed dataset without active data collection. Based on the composition of the offline dataset, two main methods are used: imitation learning, which is suitable for expert datasets, and vanilla offline RL, which often requires uniform coverage datasets. From a practical standpoint, datasets often deviate from these two extremes and the exact data composition is usually unknown. To bridge this gap, we present a new offline RL framework that smoothly interpolates between the two extremes of data composition, hence unifying imitation learning and vanilla offline RL. The new framework is centered around a weak version of the concentrability coefficient that measures the deviation of the behavior policy from the expert policy alone.

bridging offline reinforcement learning, offline rl, reinforcement learning and imitation learning, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Unifying PAC and Regret: Uniform PAC Bounds for Episodic Reinforcement Learning

Neural Information Processing Systems | Nov-21-2025, 14:26:20 GMT

Statistical performance bounds for reinforcement learning (RL) algorithms can be critical for high-stakes applications like healthcare. This paper introduces a new framework for theoretically measuring the performance of such algorithms called Uniform-PAC, which is a strengthening of the classical Probably Approximately Correct (PAC) framework. In contrast to the PAC framework, the uniform version may be used to derive high-probability regret guarantees and so forms a bridge between the two setups that has been missing in the literature. We demonstrate the benefits of the new framework for finite-state episodic MDPs with a new algorithm that is Uniform-PAC and simultaneously achieves optimal regret and PAC guarantees except for a factor of the horizon.

episodic reinforcement learning, uniform pac bound, unifying pac and regret, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.66)

Denoising Diffusion as a New Framework for Underwater Images

Jain, Nilesh, Alhajjar, Elie

arXiv.org Artificial Intelligence | Oct-14-2025

Underwater images play a crucial role in ocean research and marine environmental monitoring because they provide quality information about the ecosystem. However, the complex and remote nature of the environment results in poor image quality, with issues such as low visibility, blurry textures, color distortion, and noise. In recent years, research in image enhancement has proven effective but has its own limitations, such as poor generalization and heavy reliance on clean datasets. One challenge is the lack of diversity and the low quality of the images in these datasets. Moreover, most existing datasets consist only of monocular images, which limits the representation of different lighting conditions and angles. In this paper, we propose a new plan of action to overcome these limitations. On the one hand, we call for expanding the datasets using a denoising diffusion model to include a variety of image types such as stereo, wide-angle, macro, and close-up images. On the other hand, we recommend enhancing the images using ControlNet to evaluate and increase the quality of the corresponding datasets, and hence improve the study of the marine ecosystem.

Tags: Underwater Images, Denoising Diffusion, Marine ecosystem, ControlNet

artificial intelligence, diffusion model, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2510.09934

Country:

North America > United States (0.04)
Africa > South Africa > Gauteng > Johannesburg (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

607bc9ebe4abfcd65181bfbef6252830-AuthorFeedback.pdf

Neural Information Processing Systems | Oct-9-2025, 14:34:45 GMT

artificial intelligence, heavy-tailed perturbation, perturbation, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence (0.31)

1ee942c6b182d0f041a2312947385b23-AuthorFeedback.pdf

Neural Information Processing Systems | Oct-2-2025, 09:44:21 GMT

artificial intelligence, machine learning, sample complexity, (16 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.30)

Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism

Neural Information Processing Systems | May-26-2025, 20:43:37 GMT

Offline (or batch) reinforcement learning (RL) algorithms seek to learn an optimal policy from a fixed dataset without active data collection. Based on the composition of the offline dataset, two main methods are used: imitation learning, which is suitable for expert datasets, and vanilla offline RL, which often requires uniform coverage datasets. From a practical standpoint, datasets often deviate from these two extremes and the exact data composition is usually unknown. To bridge this gap, we present a new offline RL framework that smoothly interpolates between the two extremes of data composition, hence unifying imitation learning and vanilla offline RL. The new framework is centered around a weak version of the concentrability coefficient that measures the deviation of the behavior policy from the expert policy alone.

artificial intelligence, machine learning, offline rl, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Generative AI, online platforms and compensation for content: the need for a new framework

AIHub | Feb-24-2025, 09:56:59 GMT

The emergence of generative artificial intelligence has put the issue of compensation for content producers back on the table. Generative AI offers undeniable benefits but raises familiar fears tied to disruptive technologies. Legal battles are already emerging worldwide, with intellectual property owners and AI developers clashing over rights. Alongside these legal and ethical concerns lies the economic question: how should revenues generated by AI be fairly distributed? Individual contributions to AI-generated outputs are often too complex to quantify, making it difficult to apply the principle of proportional remuneration, which holds that payment for an individual work is tied to the revenue it generates.

machine learning, natural language, platform, (15 more...)

AIHub

Genre: Personal > Honors (0.31)

Industry: Law > Intellectual Property & Technology Law (0.52)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.93)

Bridging Offline Reinforcement Learning and Imitation Learning: A Tale of Pessimism

Neural Information Processing Systems | Oct-10-2024, 17:51:05 GMT

Offline (or batch) reinforcement learning (RL) algorithms seek to learn an optimal policy from a fixed dataset without active data collection. Based on the composition of the offline dataset, two main methods are used: imitation learning, which is suitable for expert datasets, and vanilla offline RL, which often requires uniform coverage datasets. From a practical standpoint, datasets often deviate from these two extremes and the exact data composition is usually unknown. To bridge this gap, we present a new offline RL framework that smoothly interpolates between the two extremes of data composition, hence unifying imitation learning and vanilla offline RL. The new framework is centered around a weak version of the concentrability coefficient that measures the deviation of the behavior policy from the expert policy alone.

bridging offline reinforcement learning, offline rl, reinforcement learning and imitation learning, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
